Analysing heat vulnerability within the Boston Metropolitan Region

Introduction

Extreme heat is a chronic climate hazard that will shape Boston's climate throughout the 21st century. The average summer temperature is expected to rise from 69 degrees Fahrenheit during 1980-2010 to 76 degrees Fahrenheit by 2050, with more days of extreme heat. The Boston urban area tends to be hotter than the surrounding suburban and rural areas because of the urban heat island effect.

The purpose of this analysis is to determine the overall heat vulnerability of ZIP Code Tabulation Areas (ZCTAs) within the Boston Metropolitan Region, considering the following factors:

  • Population below the poverty level
  • Population over 25 years with less than a High School degree
  • Age-dependency ratio indicating number of young children and older adults dependent on the working population
  • Population that is non-White
  • Mean land surface temperature
  • Tree canopy cover

The first four factors indicate socio-economic vulnerability of the ZCTAs and the last two indicate environmental burdens.

Data Used:

Vector Shapefiles (.shp)

  • Massachusetts MPO boundaries: boston-data/RAW_DATA/Vectors/MPO_Boundaries.shp
  • Massachusetts ZCTA boundaries: boston-data/RAW_DATA/Vectors/mass_zcta.shp

American Community Survey (2019) Spreadsheets (.csv)

  • Race: boston-data/RAW_DATA/ACS_Spreadsheets/ACSDT5Y2019.B02001_data_with_overlays.csv
  • Age-Dependency Ratio: boston-data/RAW_DATA/ACS_Spreadsheets/ACSST5Y2019.S0101_data_with_overlays.csv
  • Educational Attainment: boston-data/RAW_DATA/ACS_Spreadsheets/ACSST5Y2019.S1501_data_with_overlays.csv
  • Poverty: boston-data/RAW_DATA/ACS_Spreadsheets/ACSST5Y2019.S1701_data_with_overlays.csv

Raster Images (GeoTIFF)

  • Land Surface Temperature: boston-data/RAW_DATA/Rasters/LandSurfaceTemperature/lst_bostonmetro.tif
  • Tree Canopy Cover: boston-data/RAW_DATA/Rasters/TreeCanopyCover/NLCD_2016_Tree_Canopy_Boston.tif

Analysis Overview:

  1. Read in all the shapefiles, rasters and spreadsheets.
  2. Set the coordinate reference system for all vectors based on the rasters.
  3. Select the Boston MPO region and clip the ZCTA boundaries within it.
  4. Attribute join the spreadsheets with the ZCTA boundaries using the ZCTA codes. Summarize and map the fields that are used later in the analysis.
  5. Perform zonal statistics using the ZCTA shapefile to extract the mean land surface temperature (LST) and tree canopy cover (TCC) from the rasters.
  6. Create bins for classifying the demographic variables into five categories of vulnerability, with 1 indicating the lowest vulnerability and 5 indicating the highest.
  7. Normalize the demographic, mean LST and mean TCC variables using the mean and standard deviation. Classify the normalized values into five bins. The five bins, numbered 1 to 5, act as the vulnerability scores, which are used to calculate a final aggregate vulnerability for every ZCTA.
  8. Percentile rank the aggregate vulnerability and then visualize it. Report the ZCTAs at or above the 80th percentile and at or below the 20th percentile.

Import Dependencies

Load the MPO boundaries of Massachusetts and select the Boston Region. Save it as a new variable.

Load the ZCTA boundaries of Massachusetts and check the first few rows of the attribute table. The 'ZCTA5CE10' column will be used as the primary key for attribute joins at a later stage.

Load four different ACS spreadsheets for the demographic variables that are going to be used for the analysis. The spreadsheets have the file extension '.csv'.

The four variables are: percentage of people below the poverty level, age-dependency ratio, population (over 25 years) with less than a high school degree, and percentage of non-White population. At this stage, the spreadsheets are only imported and cleaned, and only the columns used later in the analysis are imported.

The column 'NAME' will later be used to perform attribute joins with the ZCTA boundary file. The last five characters of each 'NAME' value, which hold the ZCTA code, are extracted into a new column named 'ID_CODE'.
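As a minimal sketch of this step, the snippet below builds a toy table in place of the ACS spreadsheet and derives 'ID_CODE' from 'NAME'. The "ZCTA5 <code>" pattern of the 'NAME' values, the 'PCT_POVERTY' column name and all numbers are assumptions made for illustration. The code is kept as a string because Massachusetts ZCTA codes begin with 0, and a numeric conversion would drop that leading zero.

```python
import pandas as pd

# Toy stand-in for an ACS spreadsheet. The NAME pattern, the PCT_POVERTY
# column name and the values are assumptions, not taken from the real file.
acs = pd.DataFrame({
    "NAME": ["ZCTA5 02108", "ZCTA5 02139", "ZCTA5 01701"],
    "PCT_POVERTY": [12.4, 9.8, 6.1],
})

# Keep the last five characters as text so the leading zero survives.
acs["ID_CODE"] = acs["NAME"].astype(str).str.strip().str[-5:]

print(acs[["NAME", "ID_CODE"]])
```

When reading the real spreadsheets, it may be worth confirming that the join key in the boundary file is also stored as text, so both sides compare as strings.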

Load the two rasters that are going to be used in the analysis.

The first raster is the land surface temperature (LST) raster for the Boston Metropolitan Region, calculated from a Landsat 8 image. The second raster is the tree canopy cover (TCC) raster from the National Land Cover Database (NLCD).

Check the coordinate reference systems (CRS) for all rasters and vectors. Reproject vectors if they have a CRS different from the rasters.

Clip the ZCTA boundaries to the Boston MPO boundary

Attribute join: Join the four demographic variables with the ZCTA_boston shapefile to create four different shapefiles, based on ZCTA codes
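The join itself can be sketched with plain pandas tables standing in for the boundary file and one ACS table. 'ZCTA5CE10' and 'ID_CODE' come from the text above, while 'PCT_POVERTY' and the values are hypothetical. A left join keeps every ZCTA, so unmatched codes show up as missing values that can be counted afterwards.

```python
import pandas as pd

# Toy stand-ins: the attribute table of the clipped ZCTA layer and one ACS table.
zcta = pd.DataFrame({"ZCTA5CE10": ["02108", "02139", "01701"]})
poverty = pd.DataFrame({
    "ID_CODE": ["02108", "02139", "99999"],   # "99999" has no matching ZCTA
    "PCT_POVERTY": [12.4, 9.8, 3.0],          # hypothetical values
})

# Left join on the ZCTA code; validate guards against duplicated keys.
joined = zcta.merge(
    poverty,
    left_on="ZCTA5CE10",
    right_on="ID_CODE",
    how="left",
    validate="one_to_one",
)

# ZCTAs that found no ACS record end up with a missing value.
unmatched = int(joined["PCT_POVERTY"].isna().sum())
print(joined)
print("ZCTAs without ACS data:", unmatched)
```

A GeoDataFrame accepts the same `merge` call, so the pattern should carry over to the boundary layer with its geometry column intact.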

Attribute Join: Percentage of people below poverty level

Attribute Join: Age-dependency Ratio

Attribute Join: Population (over 25yrs) with less than a high school degree

Calculate the percentage of people over 25 years with less than a high school degree. The calculation uses three columns from the edu_attain table: total population over 25 years (S1501_C01_006E), people over 25 years with educational attainment below 9th grade (S1501_C01_007E) and people over 25 years who attended 9th to 12th grade without earning a diploma (S1501_C01_008E).
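A minimal sketch of this calculation, using the three column codes named above on toy counts: the output column name 'PCT_LESS_HS' is hypothetical, and treating a zero denominator as missing is an assumption about how empty ZCTAs should be handled.

```python
import pandas as pd

# Toy counts; only the column codes are taken from the text.
edu_attain = pd.DataFrame({
    "ID_CODE": ["02108", "02139", "01701"],
    "S1501_C01_006E": [2000, 4000, 0],   # denominator
    "S1501_C01_007E": [50, 300, 0],      # below 9th grade
    "S1501_C01_008E": [150, 200, 0],     # 9th to 12th grade, no diploma
})

less_than_hs = edu_attain["S1501_C01_007E"] + edu_attain["S1501_C01_008E"]

# Mask zero denominators so empty ZCTAs become NaN instead of dividing by zero.
total = edu_attain["S1501_C01_006E"].where(edu_attain["S1501_C01_006E"] > 0)

edu_attain["PCT_LESS_HS"] = 100 * less_than_hs / total
print(edu_attain[["ID_CODE", "PCT_LESS_HS"]])
```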

Attribute Join: Percentage of Non-Whites

Calculate the percentage of the population that is non-White. The calculation uses two columns from the race table: total population (B02001_001E) and White population (B02001_002E).

Zonal statistics on LST Raster
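To illustrate what zonal statistics compute, here is a library-free sketch in NumPy. It assumes the ZCTA polygons have already been rasterized onto the same grid as the temperature raster, so each cell carries a zone id. That setup, the function name `zonal_mean` and the toy values are illustrative and not the notebook's actual method, which would normally rely on a dedicated zonal statistics tool working directly on the polygons.

```python
import numpy as np

# Toy 4x4 "LST" raster; NaN marks a nodata cell.
lst = np.array([
    [30.0, 31.0, 35.0, 36.0],
    [30.0, 31.0, 35.0, 36.0],
    [28.0, 28.0, np.nan, 34.0],
    [28.0, 28.0, 33.0, 33.0],
])

# Same-shaped zone raster: each cell holds the id of the zone covering it,
# with 0 meaning the cell lies outside every zone.
zones = np.array([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 2, 2],
    [3, 3, 0, 0],
])

def zonal_mean(values, zone_ids, outside=0):
    """Return {zone id: mean of the valid cells inside that zone}."""
    means = {}
    for zone in np.unique(zone_ids):
        if zone == outside:
            continue
        cells = values[zone_ids == zone]
        means[int(zone)] = float(np.nanmean(cells))  # nodata cells are skipped
    return means

zone_means = zonal_mean(lst, zones)
print(zone_means)
```

The same function applies unchanged to a tree canopy cover array, since only the input values differ.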

Zonal statistics on TCC Raster

Classify the demographic variables into vulnerability categories

Normalize each column using its mean and standard deviation, then create bins to classify the data
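One way to sketch this step is a z-score followed by `pd.cut`. The bin edges at -1.5, -0.5, 0.5 and 1.5 standard deviations, the use of the population standard deviation, and the function name `zscore_bins` are all assumptions, since the text does not state the thresholds. The `higher_is_worse` flag is also an assumption: it flips the scale for tree canopy cover, where more canopy is taken to mean lower vulnerability.

```python
import numpy as np
import pandas as pd

def zscore_bins(series, edges=(-1.5, -0.5, 0.5, 1.5), higher_is_worse=True):
    """Standardize a column and classify it into vulnerability scores 1 to 5."""
    z = (series - series.mean()) / series.std(ddof=0)  # population std (assumed)
    bins = [-np.inf, *edges, np.inf]
    scores = pd.cut(z, bins=bins, labels=[1, 2, 3, 4, 5]).astype(int)
    # Flip the scale when a high raw value means low vulnerability.
    return scores if higher_is_worse else 6 - scores

# Toy values for five ZCTAs (hypothetical).
poverty = pd.Series([2.0, 4.0, 6.0, 8.0, 30.0])
tree_canopy = pd.Series([5.0, 20.0, 35.0, 50.0, 65.0])

poverty_score = zscore_bins(poverty)
tcc_score = zscore_bins(tree_canopy, higher_is_worse=False)

print(poverty_score.tolist())
print(tcc_score.tolist())
```

With fixed z-score edges, a strongly skewed column can leave some bins empty, as the poverty example shows. Quantile-based bins would be an alternative if evenly filled classes are preferred.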

Percentage of people below poverty level

Age-dependency Ratio

Population (over 25yrs) with less than a high school degree

Percentage of Non-Whites

LST and TCC Boston (Zonal Stat output)

Attribute join all the demographic variables based on ZCTA codes

Join the cleaned table with all demographic, LST and TCC values with the ZCTA shapefile to create the final geodataframe for the analysis

Visualization of all indicators used for the analysis

Calculate vulnerability for every ZCTA and percentile rank them

Create a new column called 'VUL_SUM', which is the sum of the vulnerability scores (1 = least vulnerable, 5 = most vulnerable) of all six variables used in the analysis: poverty, age-dependency ratio, educational attainment, race, mean LST and mean TCC. The aggregate score ranges from 6 (lowest vulnerability) to 30 (highest vulnerability).
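The aggregation can be sketched as a row-wise sum over the six score columns. Only 'VUL_SUM' is named in the text; the six score column names and the toy scores below are hypothetical.

```python
import pandas as pd

# Hypothetical names for the six score columns (each holds values 1 to 5).
score_cols = ["POV_VUL", "AGE_VUL", "EDU_VUL", "RACE_VUL", "LST_VUL", "TCC_VUL"]

scores = pd.DataFrame(
    {
        "POV_VUL": [1, 5, 3],
        "AGE_VUL": [1, 5, 2],
        "EDU_VUL": [1, 5, 4],
        "RACE_VUL": [1, 5, 1],
        "LST_VUL": [1, 5, 5],
        "TCC_VUL": [1, 5, 3],
    },
    index=["02108", "02139", "01701"],  # toy ZCTA codes
)

# Row-wise sum of the six scores gives the aggregate vulnerability.
scores["VUL_SUM"] = scores[score_cols].sum(axis=1)

# Sanity check: six scores between 1 and 5 must sum to a value in [6, 30].
assert scores["VUL_SUM"].between(6, 30).all()
print(scores["VUL_SUM"])
```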

ZCTAs with higher vulnerability receive a higher rank. The percentile rank is calculated from the 'VUL_SUM' column and stored in the 'VUL_SCORE' column.
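A percentile rank of this kind can be sketched with `Series.rank(pct=True)`, which returns each row's rank as a fraction of the number of rows. Expressing 'VUL_SCORE' on a 0 to 1 scale and averaging the ranks of tied ZCTAs are assumptions here, and the 'VUL_SUM' values are toy data.

```python
import pandas as pd

ranked = pd.DataFrame({
    "ZCTA5CE10": ["02108", "01701", "02446", "02148", "02139"],  # toy codes
    "VUL_SUM": [6, 12, 18, 18, 30],                              # toy sums
})

# pct=True divides each rank by the row count; ties share the average rank.
ranked["VUL_SCORE"] = ranked["VUL_SUM"].rank(pct=True, method="average")

print(ranked)
```

Because 'VUL_SUM' only takes integer values between 6 and 30, ties are likely in practice, so the choice of tie-handling method affects the resulting percentiles.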

Report Highest and Lowest ranking ZCTAs
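The report can be sketched as two filters on 'VUL_SCORE'. The cutoffs of 0.8 and 0.2 assume that the intended groups are the ZCTAs at or above the 80th percentile and at or below the 20th percentile. That reading, the constant names and the toy scores are assumptions.

```python
import pandas as pd

ranked = pd.DataFrame({
    "ZCTA5CE10": ["02108", "01701", "02446", "02148", "02139"],  # toy codes
    "VUL_SCORE": [0.2, 0.4, 0.7, 0.7, 1.0],                      # toy percentile ranks
})

HIGH_CUTOFF = 0.8  # assumed threshold for the most vulnerable group
LOW_CUTOFF = 0.2   # assumed threshold for the least vulnerable group

highest = ranked[ranked["VUL_SCORE"] >= HIGH_CUTOFF].sort_values(
    "VUL_SCORE", ascending=False
)
lowest = ranked[ranked["VUL_SCORE"] <= LOW_CUTOFF].sort_values("VUL_SCORE")

print("Most vulnerable ZCTAs:")
print(highest)
print("Least vulnerable ZCTAs:")
print(lowest)
```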